Short text classification using latent Dirichlet allocation
ZHANG Zhifei MIAO Duoqian GAO Can
Journal of Computer Applications
2013, 33(06): 1587-1590.
DOI: 10.3724/SP.J.1087.2013.01587
To address the two key problems of short text classification, namely very sparse features and strong context dependency, a new method based on latent Dirichlet allocation (LDA) is proposed. The generated topics distinguish the contexts of common words and decrease their weights, and they also reduce sparsity by connecting distinguishing words and increasing their weights. In addition, a short text dataset was constructed by crawling the titles of Netease pages. Experiments were conducted by classifying these short titles with K-nearest neighbors. The proposed method outperforms both the vector space model and topic-based similarity.
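The abstract does not give the weighting formulas, so the following is only a minimal sketch of the general pipeline it mentions (LDA topics feeding a K-nearest-neighbors classifier), not the authors' method. It represents each title by its LDA topic distribution, which is closer to the topic-based similarity baseline than to the proposed word reweighting. The function names `fit_lda`, `cosine`, and `knn_predict`, the hyperparameters, and the toy titles are all hypothetical, and the query title is fitted jointly with the training titles as a simplification.

```python
import math
import random
from collections import Counter


def fit_lda(docs, n_topics, n_iters=100, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA; returns per-document topic distributions."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    w2i = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)
    ndk = [[0] * n_topics for _ in docs]            # document-topic counts
    nkw = [[0] * n_words for _ in range(n_topics)]  # topic-word counts
    nk = [0] * n_topics                             # tokens assigned to each topic
    z = []                                          # topic assignment of every token
    for d, doc in enumerate(docs):
        assignments = []
        for w in doc:
            k = rng.randrange(n_topics)
            assignments.append(k)
            ndk[d][k] += 1
            nkw[k][w2i[w]] += 1
            nk[k] += 1
        z.append(assignments)
    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k, wi = z[d][i], w2i[w]
                # Remove the token, then resample its topic from the conditional.
                ndk[d][k] -= 1
                nkw[k][wi] -= 1
                nk[k] -= 1
                weights = [
                    (ndk[d][t] + alpha) * (nkw[t][wi] + beta) / (nk[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                z[d][i] = k
                ndk[d][k] += 1
                nkw[k][wi] += 1
                nk[k] += 1
    # Smoothed topic proportions; each row sums to 1.
    return [
        [(ndk[d][t] + alpha) / (len(doc) + n_topics * alpha) for t in range(n_topics)]
        for d, doc in enumerate(docs)
    ]


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def knn_predict(train_vectors, train_labels, query_vector, k=3):
    """Majority label among the k training vectors most similar to the query."""
    ranked = sorted(
        zip(train_vectors, train_labels),
        key=lambda pair: cosine(pair[0], query_vector),
        reverse=True,
    )
    return Counter(label for _, label in ranked[:k]).most_common(1)[0][0]


# Toy, made-up titles (assumption: whitespace tokenization is good enough here).
titles = [
    ("stocks fall as bank rates rise", "finance"),
    ("bank profit beats market forecast", "finance"),
    ("market stocks rally after rates cut", "finance"),
    ("team wins final match of season", "sports"),
    ("coach praises team after match win", "sports"),
    ("season opens with home team win", "sports"),
]
docs = [text.split() for text, _ in titles]
labels = [label for _, label in titles]
query = "bank rates and market stocks".split()

theta = fit_lda(docs + [query], n_topics=2)
predicted = knn_predict(theta[:-1], labels, theta[-1], k=3)
print(predicted)
```

Cosine similarity over topic vectors is one common choice; the paper's own similarity measure and topic-driven word weights would replace it in a faithful implementation.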